#3

Kaggle competition: [link]

Entry by Robin R.P.M. Kras

⭐ 1. Introduction & Overview

Ask a home buyer to describe their dream house, and they probably won't begin with the height of the basement ceiling or the proximity to an east-west railroad. But this playground competition's dataset proves that much more influences price negotiations than the number of bedrooms or a white-picket fence.

With 79 explanatory variables describing (almost) every aspect of residential homes in Ames, Iowa, this competition challenges you to predict the final price of each home.

🔹 2. Import Libraries & Set Up

In [2]:
# General
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Machine Learning
import xgboost as xg
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score, root_mean_squared_error
from sklearn.model_selection import GridSearchCV
from sklearn.linear_model import LinearRegression

# Feature Importance & Explainability
import shap

# Settings
import warnings
warnings.filterwarnings("ignore")

# Set random seed for reproducibility
SEED = 42
np.random.seed(SEED)

print("Libraries loaded. Ready to go!")
Libraries loaded. Ready to go!

🔹 3. Load & Explore Data

In [3]:
train = pd.read_csv('train.csv')
test = pd.read_csv('test.csv')
In [4]:
train.head()
Out[4]:
Id MSSubClass MSZoning LotFrontage LotArea Street Alley LotShape LandContour Utilities ... PoolArea PoolQC Fence MiscFeature MiscVal MoSold YrSold SaleType SaleCondition SalePrice
0 1 60 RL 65.0 8450 Pave NaN Reg Lvl AllPub ... 0 NaN NaN NaN 0 2 2008 WD Normal 208500
1 2 20 RL 80.0 9600 Pave NaN Reg Lvl AllPub ... 0 NaN NaN NaN 0 5 2007 WD Normal 181500
2 3 60 RL 68.0 11250 Pave NaN IR1 Lvl AllPub ... 0 NaN NaN NaN 0 9 2008 WD Normal 223500
3 4 70 RL 60.0 9550 Pave NaN IR1 Lvl AllPub ... 0 NaN NaN NaN 0 2 2006 WD Abnorml 140000
4 5 60 RL 84.0 14260 Pave NaN IR1 Lvl AllPub ... 0 NaN NaN NaN 0 12 2008 WD Normal 250000

5 rows × 81 columns

In [5]:
train.shape
Out[5]:
(1460, 81)
In [6]:
train.isnull().sum()
Out[6]:
Id                 0
MSSubClass         0
MSZoning           0
LotFrontage      259
LotArea            0
                ... 
MoSold             0
YrSold             0
SaleType           0
SaleCondition      0
SalePrice          0
Length: 81, dtype: int64
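
Because the printed Series above is truncated, the columns that actually contain gaps are easy to miss. A minimal sketch, on a small made-up frame rather than the competition data, of listing only the columns with missing values together with their share of rows; the helper name `missing_report` is hypothetical:

```python
import numpy as np
import pandas as pd

def missing_report(df: pd.DataFrame) -> pd.DataFrame:
    """Return count and fraction of missing values for columns that have any."""
    counts = df.isnull().sum()
    counts = counts[counts > 0].sort_values(ascending=False)
    return pd.DataFrame({"missing": counts, "fraction": counts / len(df)})

# Toy data for illustration only (not the Ames dataset)
toy = pd.DataFrame({
    "LotFrontage": [65.0, np.nan, 68.0, np.nan],
    "Alley": [np.nan, np.nan, np.nan, "Grvl"],
    "LotArea": [8450, 9600, 11250, 9550],
})
report = missing_report(toy)
print(report)
```

On the real data the corresponding call would be `missing_report(train)`.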
In [7]:
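# Quick summary of the dataset: column dtypes and non-null counts.
# (A bare train.describe() before info() would not be displayed, because a
# notebook cell only renders the value of its last expression.)
train.info()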
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1460 entries, 0 to 1459
Data columns (total 81 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   Id             1460 non-null   int64  
 1   MSSubClass     1460 non-null   int64  
 2   MSZoning       1460 non-null   object 
 3   LotFrontage    1201 non-null   float64
 4   LotArea        1460 non-null   int64  
 5   Street         1460 non-null   object 
 6   Alley          91 non-null     object 
 7   LotShape       1460 non-null   object 
 8   LandContour    1460 non-null   object 
 9   Utilities      1460 non-null   object 
 10  LotConfig      1460 non-null   object 
 11  LandSlope      1460 non-null   object 
 12  Neighborhood   1460 non-null   object 
 13  Condition1     1460 non-null   object 
 14  Condition2     1460 non-null   object 
 15  BldgType       1460 non-null   object 
 16  HouseStyle     1460 non-null   object 
 17  OverallQual    1460 non-null   int64  
 18  OverallCond    1460 non-null   int64  
 19  YearBuilt      1460 non-null   int64  
 20  YearRemodAdd   1460 non-null   int64  
 21  RoofStyle      1460 non-null   object 
 22  RoofMatl       1460 non-null   object 
 23  Exterior1st    1460 non-null   object 
 24  Exterior2nd    1460 non-null   object 
 25  MasVnrType     588 non-null    object 
 26  MasVnrArea     1452 non-null   float64
 27  ExterQual      1460 non-null   object 
 28  ExterCond      1460 non-null   object 
 29  Foundation     1460 non-null   object 
 30  BsmtQual       1423 non-null   object 
 31  BsmtCond       1423 non-null   object 
 32  BsmtExposure   1422 non-null   object 
 33  BsmtFinType1   1423 non-null   object 
 34  BsmtFinSF1     1460 non-null   int64  
 35  BsmtFinType2   1422 non-null   object 
 36  BsmtFinSF2     1460 non-null   int64  
 37  BsmtUnfSF      1460 non-null   int64  
 38  TotalBsmtSF    1460 non-null   int64  
 39  Heating        1460 non-null   object 
 40  HeatingQC      1460 non-null   object 
 41  CentralAir     1460 non-null   object 
 42  Electrical     1459 non-null   object 
 43  1stFlrSF       1460 non-null   int64  
 44  2ndFlrSF       1460 non-null   int64  
 45  LowQualFinSF   1460 non-null   int64  
 46  GrLivArea      1460 non-null   int64  
 47  BsmtFullBath   1460 non-null   int64  
 48  BsmtHalfBath   1460 non-null   int64  
 49  FullBath       1460 non-null   int64  
 50  HalfBath       1460 non-null   int64  
 51  BedroomAbvGr   1460 non-null   int64  
 52  KitchenAbvGr   1460 non-null   int64  
 53  KitchenQual    1460 non-null   object 
 54  TotRmsAbvGrd   1460 non-null   int64  
 55  Functional     1460 non-null   object 
 56  Fireplaces     1460 non-null   int64  
 57  FireplaceQu    770 non-null    object 
 58  GarageType     1379 non-null   object 
 59  GarageYrBlt    1379 non-null   float64
 60  GarageFinish   1379 non-null   object 
 61  GarageCars     1460 non-null   int64  
 62  GarageArea     1460 non-null   int64  
 63  GarageQual     1379 non-null   object 
 64  GarageCond     1379 non-null   object 
 65  PavedDrive     1460 non-null   object 
 66  WoodDeckSF     1460 non-null   int64  
 67  OpenPorchSF    1460 non-null   int64  
 68  EnclosedPorch  1460 non-null   int64  
 69  3SsnPorch      1460 non-null   int64  
 70  ScreenPorch    1460 non-null   int64  
 71  PoolArea       1460 non-null   int64  
 72  PoolQC         7 non-null      object 
 73  Fence          281 non-null    object 
 74  MiscFeature    54 non-null     object 
 75  MiscVal        1460 non-null   int64  
 76  MoSold         1460 non-null   int64  
 77  YrSold         1460 non-null   int64  
 78  SaleType       1460 non-null   object 
 79  SaleCondition  1460 non-null   object 
 80  SalePrice      1460 non-null   int64  
dtypes: float64(3), int64(35), object(43)
memory usage: 924.0+ KB

🔹 4. Data Visualization & EDA

In [8]:
float_cols = [col for col in train.columns if train[col].dtype == "float64"]

cols_per_row = 3
num_plots = len(float_cols)
rows = (num_plots // cols_per_row) + (num_plots % cols_per_row > 0) 

fig, axes = plt.subplots(rows, cols_per_row, figsize=(15, 5 * rows)) 
axes = axes.flatten()  

for idx, col in enumerate(float_cols):
    sns.histplot(train[col], bins=50, kde=True, ax=axes[idx])
    axes[idx].set_title(f"Distribution of {col}")

for i in range(idx + 1, len(axes)):
    fig.delaxes(axes[i])

plt.tight_layout()
plt.show()
[Figure: histograms with KDE curves, one per float64 column]
In [9]:
categorical_features = train.select_dtypes(include=['object']).columns

num_features = len(categorical_features)
cols = 3 
rows = (num_features // cols) + (num_features % cols > 0) 

# Create subplots
fig, axes = plt.subplots(rows, cols, figsize=(15, rows * 5)) 
axes = axes.flatten()  

for i, feature in enumerate(categorical_features):
    train[feature].value_counts().plot.pie(
        autopct='%1.1f%%', ax=axes[i], startangle=90, cmap="viridis"
    )
    axes[i].set_title(feature)
    axes[i].set_ylabel("") 

# Hide any unused subplots
for j in range(i + 1, len(axes)):
    fig.delaxes(axes[j])

plt.tight_layout()
plt.show()
[Figure: pie charts of category shares, one per categorical column]
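
Both plotting cells size their subplot grid with the same ceiling-division expression. A small sketch of that arithmetic in isolation; `grid_rows` is a hypothetical helper name, and the counts 3 and 43 are the float64 and object column counts reported by `train.info()`:

```python
import math

def grid_rows(num_plots: int, cols_per_row: int) -> int:
    """Number of subplot rows needed, i.e. ceil(num_plots / cols_per_row)."""
    # Same value as (num_plots // cols_per_row) + (num_plots % cols_per_row > 0)
    return math.ceil(num_plots / cols_per_row)

float_rows = grid_rows(3, 3)         # 3 float64 columns
categorical_rows = grid_rows(43, 3)  # 43 object columns
unused_axes = categorical_rows * 3 - 43

print(float_rows, categorical_rows, unused_axes)
```

The unused axes in the last row are the ones the cells remove with `fig.delaxes`.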
In [10]:
heatmap_train = pd.DataFrame()

for col in train.columns:
    if train[col].dtype == "float64" or train[col].dtype == "int64":
        heatmap_train[col] = train[col]

plt.figure(figsize=(30,12))
sns.heatmap(heatmap_train.corr(), annot=True, cmap="coolwarm")
plt.title("Feature Correlation Matrix")
plt.show()
[Figure: feature correlation matrix heatmap]
In [11]:
heatmap_train = train.select_dtypes(include=["float64", "int64"])

corr_matrix = heatmap_train.corr()

threshold = 0.75

high_corr_pairs = (
    corr_matrix.where(np.triu(np.ones(corr_matrix.shape), k=1).astype(bool)) 
    .stack()  
    .reset_index()
)

high_corr_pairs.columns = ["Feature 1", "Feature 2", "Correlation"]
high_corr_pairs = high_corr_pairs[high_corr_pairs["Correlation"].abs() > threshold]  

plt.figure(figsize=(30, 12))
sns.heatmap(corr_matrix, annot=True, cmap="coolwarm")
plt.title("Feature Correlation Matrix")
plt.show()

print("Highly correlated feature pairs (above threshold):")
print(high_corr_pairs)
[Figure: feature correlation matrix heatmap]
Highly correlated feature pairs (above threshold):
       Feature 1     Feature 2  Correlation
174  OverallQual     SalePrice     0.790982
225    YearBuilt   GarageYrBlt     0.825667
378  TotalBsmtSF      1stFlrSF     0.819530
478    GrLivArea  TotRmsAbvGrd     0.825489
637   GarageCars    GarageArea     0.882475
In [12]:
l1 = high_corr_pairs['Feature 1'].tolist()
l2 = high_corr_pairs['Feature 2'].tolist()
interesting_features = list(set(l1+l2))

interesting_features.remove('SalePrice')

print(interesting_features)
['GarageArea', 'GrLivArea', 'YearBuilt', 'OverallQual', 'TotRmsAbvGrd', '1stFlrSF', 'TotalBsmtSF', 'GarageCars', 'GarageYrBlt']
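
The order of a list built from a `set` of strings is not guaranteed to be the same from one Python session to the next, and the names of the pairwise features created in the feature-engineering step depend on that order. A possible variant, shown here on the five pairs printed above, sorts the names so the result is reproducible (sorting is an assumption of this sketch, not what the cell above does):

```python
# Feature pairs copied from the printed table above
l1 = ["OverallQual", "YearBuilt", "TotalBsmtSF", "GrLivArea", "GarageCars"]
l2 = ["SalePrice", "GarageYrBlt", "1stFlrSF", "TotRmsAbvGrd", "GarageArea"]

# sorted() gives a stable order; the target is excluded in the same step
interesting_features = sorted(set(l1 + l2) - {"SalePrice"})
print(interesting_features)
```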

🔹 5. Feature Engineering

In [13]:
train.columns = train.columns.str.strip()
test.columns = test.columns.str.strip()
In [14]:
print(f"Train set, null count: \n{train.isnull().sum()}")
print("\n")
print(f"Test set, null count: \n{test.isnull().sum()}")
Train set, null count: 
Id                 0
MSSubClass         0
MSZoning           0
LotFrontage      259
LotArea            0
                ... 
MoSold             0
YrSold             0
SaleType           0
SaleCondition      0
SalePrice          0
Length: 81, dtype: int64


Test set, null count: 
Id                 0
MSSubClass         0
MSZoning           4
LotFrontage      227
LotArea            0
                ... 
MiscVal            0
MoSold             0
YrSold             0
SaleType           1
SaleCondition      0
Length: 80, dtype: int64
In [15]:
outliers = pd.concat([
    train[(train['OverallQual'] == 4) & (train['SalePrice'] > 2e5)],
    train[(train['OverallQual'] == 8) & (train['SalePrice'] > 5e5)],
    train[(train['OverallQual'] == 10) & (train['SalePrice'] > 7e5)],
    train[(train['GrLivArea'] > 4000)],
    train[(train['OverallCond'] == 2) & (train['SalePrice'] > 3e5)],
    train[(train['OverallCond'] == 5) & (train['SalePrice'] > 7e5)],
    train[(train['OverallCond'] == 6) & (train['SalePrice'] > 7e5)]

    ]).sort_index().drop_duplicates()
In [16]:
train = train.drop(outliers.index)
In [17]:
train["LotFrontage"] = train.groupby("Neighborhood")["LotFrontage"].transform(
    lambda x: x.fillna(x.median()))
test["LotFrontage"] = test.groupby("Neighborhood")["LotFrontage"].transform(
    lambda x: x.fillna(x.median()))

for col in ('GarageType', 'GarageFinish', 'GarageQual', 'GarageCond'):
    train[col] = train[col].fillna('None')
    test[col] = test[col].fillna('None')

for col in ('GarageYrBlt', 'GarageArea', 'GarageCars'):
    train[col] = train[col].fillna(0)
    test[col] = test[col].fillna(0)

train['TotalSF'] = train['TotalBsmtSF'] + train['1stFlrSF'] + train['2ndFlrSF']
test['TotalSF'] = test['TotalBsmtSF'] + test['1stFlrSF'] + test['2ndFlrSF']
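
The cell above imputes `LotFrontage` in the test set from the test set's own neighborhood medians. One alternative, sketched here on made-up values, learns the medians from the training rows only and reuses them for the test rows, falling back to the overall training median for neighborhoods with no known value. The toy frames and the helper name `impute_by_group` are illustrative assumptions:

```python
import numpy as np
import pandas as pd

def impute_by_group(train_df, test_df, group_col, value_col):
    """Fill missing value_col in both frames using per-group medians from train_df."""
    group_median = train_df.groupby(group_col)[value_col].median()
    overall_median = train_df[value_col].median()
    filled = []
    for df in (train_df, test_df):
        # Map each row's group to its training median; unseen groups get the overall median
        fallback = df[group_col].map(group_median).fillna(overall_median)
        filled.append(df[value_col].fillna(fallback))
    return filled

toy_train = pd.DataFrame({
    "Neighborhood": ["A", "A", "A", "B", "B"],
    "LotFrontage": [60.0, 80.0, np.nan, 50.0, np.nan],
})
toy_test = pd.DataFrame({
    "Neighborhood": ["A", "B", "C"],
    "LotFrontage": [np.nan, np.nan, np.nan],
})
train_filled, test_filled = impute_by_group(toy_train, toy_test, "Neighborhood", "LotFrontage")
print(train_filled.tolist())
print(test_filled.tolist())
```

Neighborhood "C" never appears in the toy training frame, so it receives the overall training median.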
In [18]:
for col in train.columns:
    if train[col].dtype == "object":
        train[col] = train[col].fillna("None")
    elif train[col].dtype in ["float64", "int64"]:
        train[col] = train[col].fillna(train[col].mean())

for col in test.columns:
    if test[col].dtype == "object":
        test[col] = test[col].fillna("None")
    elif test[col].dtype in ["float64", "int64"]:
        test[col] = test[col].fillna(test[col].mean())    
In [19]:
for col in train.columns:
    if train[col].isnull().sum() > 0:
        print(col)

for col in test.columns:
    if test[col].isnull().sum() > 0:
        print(col)

Neither loop prints a column name, so no missing values remain in either set. Great!

In [20]:
import itertools

def create_combination_features(df, features):
    combinations = itertools.combinations(features, 2)

    for comb in combinations:
        feature_name = "_".join(comb)
        df[feature_name] = df[list(comb)].mean(axis=1)
    
    return df

train = create_combination_features(train, interesting_features)
test = create_combination_features(test, interesting_features)
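
The helper above adds one column per unordered pair of features, holding the row-wise mean of the two. With the nine features selected earlier that is C(9, 2) = 36 new columns, which, together with `TotalSF`, accounts for the 118 columns reported by the next `train.head()`. A short check of that arithmetic, with the feature names taken from the printed list:

```python
import itertools

features = ['GarageArea', 'GrLivArea', 'YearBuilt', 'OverallQual', 'TotRmsAbvGrd',
            '1stFlrSF', 'TotalBsmtSF', 'GarageCars', 'GarageYrBlt']

# Same naming scheme as create_combination_features: "<first>_<second>"
pair_names = ["_".join(pair) for pair in itertools.combinations(features, 2)]
n_columns = 81 + 1 + len(pair_names)  # original columns + TotalSF + pairwise means

print(len(pair_names), n_columns)
print(pair_names[-1])
```

Note that the mean of two features on very different scales, such as `GarageCars` and `GarageYrBlt`, is dominated by the larger one, so some of these columns may carry little information beyond the original feature.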
In [21]:
train.head()
Out[21]:
Id MSSubClass MSZoning LotFrontage LotArea Street Alley LotShape LandContour Utilities ... TotRmsAbvGrd_1stFlrSF TotRmsAbvGrd_TotalBsmtSF TotRmsAbvGrd_GarageCars TotRmsAbvGrd_GarageYrBlt 1stFlrSF_TotalBsmtSF 1stFlrSF_GarageCars 1stFlrSF_GarageYrBlt TotalBsmtSF_GarageCars TotalBsmtSF_GarageYrBlt GarageCars_GarageYrBlt
0 1 60 RL 65.0 8450 Pave None Reg Lvl AllPub ... 432.0 432.0 5.0 1005.5 856.0 429.0 1429.5 429.0 1429.5 1002.5
1 2 20 RL 80.0 9600 Pave None Reg Lvl AllPub ... 634.0 634.0 4.0 991.0 1262.0 632.0 1619.0 632.0 1619.0 989.0
2 3 60 RL 68.0 11250 Pave None IR1 Lvl AllPub ... 463.0 463.0 4.0 1003.5 920.0 461.0 1460.5 461.0 1460.5 1001.5
3 4 70 RL 60.0 9550 Pave None IR1 Lvl AllPub ... 484.0 381.5 5.0 1002.5 858.5 482.0 1479.5 379.5 1377.0 1000.5
4 5 60 RL 84.0 14260 Pave None IR1 Lvl AllPub ... 577.0 577.0 6.0 1004.5 1145.0 574.0 1572.5 574.0 1572.5 1001.5

5 rows × 118 columns

In [22]:
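from sklearn.preprocessing import LabelEncoder

# Fit each encoder on the union of train and test values so that a given
# category maps to the same integer code in both sets. Fitting separately
# on each set can assign different codes to the same category.
for col in test.columns:
    if test[col].dtype == "object":
        le = LabelEncoder()
        le.fit(pd.concat([train[col], test[col]]).astype(str))
        train[col] = le.transform(train[col].astype(str))
        test[col] = le.transform(test[col].astype(str))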
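
When label-encoding two datasets, an encoder fitted on each one independently can map the same category to different integers. A minimal sketch on made-up zoning codes (not the full columns) that shows the mismatch and one way to avoid it by fitting on the union of values:

```python
from sklearn.preprocessing import LabelEncoder

train_vals = ["FV", "RL", "RM"]   # toy values for illustration
test_vals = ["RL", "RM"]          # "FV" happens not to occur here

# Fitted separately: "RL" becomes 1 in train but 0 in test
separate_train = LabelEncoder().fit_transform(train_vals)
separate_test = LabelEncoder().fit_transform(test_vals)

# Fitted once on all values: the codes agree across both sets
shared = LabelEncoder().fit(train_vals + test_vals)
shared_train = shared.transform(train_vals)
shared_test = shared.transform(test_vals)

print(separate_train.tolist(), separate_test.tolist())
print(shared_train.tolist(), shared_test.tolist())
```

A tree model splits on these integers, so inconsistent codes between training and prediction data can silently degrade test predictions.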

🔹 6. Model Selection

In [23]:
X = train.drop(columns=["Id", "SalePrice"])
X_test = test.drop(columns=["Id"])

y = train['SalePrice']

X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=SEED)
In [ ]:
param_grid = {
    'n_estimators': [100, 200, 500],  
    'learning_rate': [0.01, 0.05, 0.1],  
    'max_depth': [3, 5, 7, 9],  
    'subsample': [0.8, 0.9, 1.0], 
    'colsample_bytree': [0.8, 0.9, 1.0],
    'alpha': [0, 0.01, 0.1, 1],
    'lambda': [0, 0.1, 0.5, 1],
    'gamma': [0, 0.1, 0.2, 1],
    'early_stopping_rounds': [5, 10, 20, 30]
}

grid_search = GridSearchCV(xg.XGBRegressor(tree_method="gpu_hist", random_state=SEED), param_grid, cv=5, n_jobs=-1)
grid_search.fit(X_train, y_train, 
            eval_set=[(X_train, y_train), (X_val, y_val)])

print("Best Parameters:", grid_search.best_params_)

best_params = grid_search.best_params_
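
An exhaustive grid search fits one model per parameter combination and per fold, so the cost grows multiplicatively with every added parameter. A quick count for the grid above, using only the list lengths and `cv=5` as in the cell:

```python
import math

param_grid = {
    'n_estimators': [100, 200, 500],
    'learning_rate': [0.01, 0.05, 0.1],
    'max_depth': [3, 5, 7, 9],
    'subsample': [0.8, 0.9, 1.0],
    'colsample_bytree': [0.8, 0.9, 1.0],
    'alpha': [0, 0.01, 0.1, 1],
    'lambda': [0, 0.1, 0.5, 1],
    'gamma': [0, 0.1, 0.2, 1],
    'early_stopping_rounds': [5, 10, 20, 30],
}

n_combinations = math.prod(len(values) for values in param_grid.values())
n_fits = n_combinations * 5  # five cross-validation folds

print(n_combinations, n_fits)
```

At 82,944 combinations and 414,720 fits, a randomized search such as scikit-learn's `RandomizedSearchCV` over the same ranges may be a more practical starting point.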
In [ ]:
model = xg.XGBRegressor(
    learning_rate=0.1,
    max_depth=6,
    early_stopping_rounds=30,
    n_estimators=200,
    random_state=SEED)

model.fit(X_train, y_train, 
            eval_set=[(X_train, y_train), (X_val, y_val)])

results = model.evals_result()

plt.figure(figsize=(10,7))
plt.plot(results["validation_0"]["rmse"], label="Training loss")
plt.plot(results["validation_1"]["rmse"], label="Validation loss")
plt.xlabel("Number of trees")
plt.ylabel("Loss")
plt.legend()

predictions = model.predict(X_val)

mse = mean_squared_error(y_val, predictions)
mae = mean_absolute_error(y_val, predictions)
r2 = r2_score(y_val, predictions)
rms = root_mean_squared_error(y_val, predictions)

print(f"Mean Squared Error: {mse}")
print(f"Mean Absolute Error: {mae}")
print(f"R² Score: {r2}")
print(f"RMSE Score: {rms}")
[0]	validation_0-rmse:68200.58338	validation_1-rmse:74748.47981
[1]	validation_0-rmse:62443.43453	validation_1-rmse:69041.04398
[2]	validation_0-rmse:57236.27183	validation_1-rmse:63616.72024
[3]	validation_0-rmse:52587.11351	validation_1-rmse:59081.66027
[4]	validation_0-rmse:48370.67513	validation_1-rmse:54924.25403
[5]	validation_0-rmse:44498.31457	validation_1-rmse:51133.11870
[6]	validation_0-rmse:41024.99419	validation_1-rmse:47691.95186
[7]	validation_0-rmse:37906.89800	validation_1-rmse:44860.43264
[8]	validation_0-rmse:35028.33049	validation_1-rmse:42103.81330
[9]	validation_0-rmse:32460.44484	validation_1-rmse:39631.82414
[10]	validation_0-rmse:30153.72735	validation_1-rmse:37490.13745
[11]	validation_0-rmse:28018.28846	validation_1-rmse:35783.47624
[12]	validation_0-rmse:26068.11098	validation_1-rmse:34137.86958
[13]	validation_0-rmse:24279.05849	validation_1-rmse:32595.22585
[14]	validation_0-rmse:22734.49950	validation_1-rmse:31368.83929
[15]	validation_0-rmse:21286.99232	validation_1-rmse:30247.44398
[16]	validation_0-rmse:20004.82806	validation_1-rmse:29243.53915
[17]	validation_0-rmse:18792.56947	validation_1-rmse:28477.47856
[18]	validation_0-rmse:17723.94309	validation_1-rmse:27697.73799
[19]	validation_0-rmse:16715.98712	validation_1-rmse:27078.37962
[20]	validation_0-rmse:15805.58841	validation_1-rmse:26546.09815
[21]	validation_0-rmse:14976.18381	validation_1-rmse:26103.68175
[22]	validation_0-rmse:14222.14456	validation_1-rmse:25645.09853
[23]	validation_0-rmse:13543.24571	validation_1-rmse:25210.17346
[24]	validation_0-rmse:12915.49834	validation_1-rmse:24815.95216
[25]	validation_0-rmse:12392.50604	validation_1-rmse:24544.73992
[26]	validation_0-rmse:11839.24454	validation_1-rmse:24286.02965
[27]	validation_0-rmse:11344.63056	validation_1-rmse:24081.26436
[28]	validation_0-rmse:10907.26205	validation_1-rmse:23975.73128
[29]	validation_0-rmse:10497.59865	validation_1-rmse:23798.71601
[30]	validation_0-rmse:10136.52198	validation_1-rmse:23663.76441
[31]	validation_0-rmse:9786.47261	validation_1-rmse:23560.58570
[32]	validation_0-rmse:9490.53794	validation_1-rmse:23428.10953
[33]	validation_0-rmse:9190.73410	validation_1-rmse:23399.25107
[34]	validation_0-rmse:8912.92467	validation_1-rmse:23312.03722
[35]	validation_0-rmse:8662.71903	validation_1-rmse:23277.95567
[36]	validation_0-rmse:8415.97257	validation_1-rmse:23187.37876
[37]	validation_0-rmse:8213.61413	validation_1-rmse:23140.39297
[38]	validation_0-rmse:8029.32556	validation_1-rmse:23099.18223
[39]	validation_0-rmse:7849.74029	validation_1-rmse:23016.18234
[40]	validation_0-rmse:7694.74838	validation_1-rmse:22978.92911
[41]	validation_0-rmse:7550.55924	validation_1-rmse:22941.79753
[42]	validation_0-rmse:7415.47416	validation_1-rmse:22917.43016
[43]	validation_0-rmse:7233.94066	validation_1-rmse:22888.17298
[44]	validation_0-rmse:7107.16350	validation_1-rmse:22864.21905
[45]	validation_0-rmse:6990.62495	validation_1-rmse:22838.42937
[46]	validation_0-rmse:6863.23131	validation_1-rmse:22806.85921
[47]	validation_0-rmse:6756.28615	validation_1-rmse:22809.91275
[48]	validation_0-rmse:6628.54699	validation_1-rmse:22775.16561
[49]	validation_0-rmse:6558.98740	validation_1-rmse:22751.30358
[50]	validation_0-rmse:6462.04591	validation_1-rmse:22743.73390
[51]	validation_0-rmse:6400.26810	validation_1-rmse:22738.44336
[52]	validation_0-rmse:6305.39724	validation_1-rmse:22709.60331
[53]	validation_0-rmse:6227.84413	validation_1-rmse:22698.41363
[54]	validation_0-rmse:6163.49545	validation_1-rmse:22690.39752
[55]	validation_0-rmse:6093.89034	validation_1-rmse:22700.06436
[56]	validation_0-rmse:6042.18483	validation_1-rmse:22694.13922
[57]	validation_0-rmse:5962.26495	validation_1-rmse:22697.39284
[58]	validation_0-rmse:5909.26601	validation_1-rmse:22686.58611
[59]	validation_0-rmse:5848.75533	validation_1-rmse:22677.84031
[60]	validation_0-rmse:5737.69466	validation_1-rmse:22673.41461
[61]	validation_0-rmse:5679.46694	validation_1-rmse:22671.89026
[62]	validation_0-rmse:5633.50201	validation_1-rmse:22647.33468
[63]	validation_0-rmse:5561.07709	validation_1-rmse:22639.73031
[64]	validation_0-rmse:5493.90501	validation_1-rmse:22626.82699
[65]	validation_0-rmse:5440.07701	validation_1-rmse:22611.40820
[66]	validation_0-rmse:5388.71456	validation_1-rmse:22603.03757
[67]	validation_0-rmse:5351.92950	validation_1-rmse:22586.74834
[68]	validation_0-rmse:5334.77985	validation_1-rmse:22591.46256
[69]	validation_0-rmse:5309.01887	validation_1-rmse:22580.62633
[70]	validation_0-rmse:5278.81464	validation_1-rmse:22577.22062
[71]	validation_0-rmse:5240.94052	validation_1-rmse:22583.64358
[72]	validation_0-rmse:5170.94148	validation_1-rmse:22574.93638
[73]	validation_0-rmse:5149.99150	validation_1-rmse:22580.30892
[74]	validation_0-rmse:5061.33582	validation_1-rmse:22557.66271
[75]	validation_0-rmse:5007.29460	validation_1-rmse:22548.18021
[76]	validation_0-rmse:4949.39206	validation_1-rmse:22539.49561
[77]	validation_0-rmse:4910.19003	validation_1-rmse:22529.71710
[78]	validation_0-rmse:4876.81530	validation_1-rmse:22526.76133
[79]	validation_0-rmse:4847.60726	validation_1-rmse:22521.28373
[80]	validation_0-rmse:4814.52352	validation_1-rmse:22528.42633
[81]	validation_0-rmse:4783.73140	validation_1-rmse:22510.88059
[82]	validation_0-rmse:4762.07653	validation_1-rmse:22504.60449
[83]	validation_0-rmse:4742.03751	validation_1-rmse:22517.56550
[84]	validation_0-rmse:4685.58208	validation_1-rmse:22508.63493
[85]	validation_0-rmse:4636.69249	validation_1-rmse:22514.28647
[86]	validation_0-rmse:4589.15833	validation_1-rmse:22502.00722
[87]	validation_0-rmse:4528.81046	validation_1-rmse:22494.10020
[88]	validation_0-rmse:4468.33393	validation_1-rmse:22491.95570
[89]	validation_0-rmse:4429.13083	validation_1-rmse:22491.72224
[90]	validation_0-rmse:4419.28827	validation_1-rmse:22486.29819
[91]	validation_0-rmse:4388.53716	validation_1-rmse:22475.19923
[92]	validation_0-rmse:4369.33362	validation_1-rmse:22464.17931
[93]	validation_0-rmse:4332.07841	validation_1-rmse:22444.07426
[94]	validation_0-rmse:4307.27544	validation_1-rmse:22438.28036
[95]	validation_0-rmse:4293.14609	validation_1-rmse:22441.53321
[96]	validation_0-rmse:4268.19973	validation_1-rmse:22428.84079
[97]	validation_0-rmse:4223.40667	validation_1-rmse:22454.66255
[98]	validation_0-rmse:4174.37167	validation_1-rmse:22449.13863
[99]	validation_0-rmse:4143.30864	validation_1-rmse:22444.45430
[100]	validation_0-rmse:4116.23088	validation_1-rmse:22450.89118
[101]	validation_0-rmse:4092.70500	validation_1-rmse:22453.31643
[102]	validation_0-rmse:4065.31134	validation_1-rmse:22454.54505
[103]	validation_0-rmse:3989.11927	validation_1-rmse:22472.55710
[104]	validation_0-rmse:3964.78741	validation_1-rmse:22471.85230
[105]	validation_0-rmse:3942.83009	validation_1-rmse:22471.92748
[106]	validation_0-rmse:3927.99931	validation_1-rmse:22463.35239
[107]	validation_0-rmse:3905.65742	validation_1-rmse:22464.17778
[108]	validation_0-rmse:3850.61132	validation_1-rmse:22476.56106
[109]	validation_0-rmse:3815.37231	validation_1-rmse:22478.70581
[110]	validation_0-rmse:3808.47856	validation_1-rmse:22477.69142
[111]	validation_0-rmse:3801.86568	validation_1-rmse:22478.73314
[112]	validation_0-rmse:3752.40206	validation_1-rmse:22482.93568
[113]	validation_0-rmse:3712.68559	validation_1-rmse:22495.86470
[114]	validation_0-rmse:3687.80951	validation_1-rmse:22492.26475
[115]	validation_0-rmse:3674.59395	validation_1-rmse:22490.28163
[116]	validation_0-rmse:3638.04632	validation_1-rmse:22487.89246
[117]	validation_0-rmse:3600.58375	validation_1-rmse:22487.42456
[118]	validation_0-rmse:3549.39242	validation_1-rmse:22487.26057
[119]	validation_0-rmse:3478.18111	validation_1-rmse:22485.70086
[120]	validation_0-rmse:3435.82543	validation_1-rmse:22475.09397
[121]	validation_0-rmse:3417.94290	validation_1-rmse:22474.60471
[122]	validation_0-rmse:3368.31666	validation_1-rmse:22465.84511
[123]	validation_0-rmse:3340.24715	validation_1-rmse:22459.90657
[124]	validation_0-rmse:3324.01229	validation_1-rmse:22464.60517
[125]	validation_0-rmse:3313.15618	validation_1-rmse:22464.80807
[126]	validation_0-rmse:3289.01327	validation_1-rmse:22461.29045
Mean Squared Error: 503052898.9427562
Mean Absolute Error: 15613.961326782646
R² Score: 0.923322856426239
RMSE Score: 22428.840784640568
[Figure: training and validation RMSE versus number of trees]
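
The reported numbers are consistent with each other: RMSE is the square root of MSE, and it equals the lowest validation RMSE in the log, reached at iteration 96. With `early_stopping_rounds=30`, training stops after 30 further rounds without improvement, which is why the log ends at iteration 126. A small check using the values printed above:

```python
import math

mse = 503052898.9427562        # "Mean Squared Error" from the output
rmse = 22428.840784640568      # "RMSE Score" from the output
best_iteration = 96            # round with the lowest validation_1-rmse in the log
early_stopping_rounds = 30     # value passed to XGBRegressor

rmse_from_mse = math.sqrt(mse)
last_logged_round = best_iteration + early_stopping_rounds

print(rmse_from_mse)
print(last_logged_round)
```

Because the same validation split drives early stopping and the reported score, the score is likely somewhat optimistic as an estimate of performance on unseen data.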
In [ ]:
X_test = test.drop(columns=['Id'])  

predictions = model.predict(X_test)  

output = pd.DataFrame({'Id': test['Id'], 'SalePrice': predictions})
output.to_csv('submission_xgb.csv', index=False)
print("Your submission was successfully saved!")
Your submission was successfully saved!

🔹 Experiment

In [ ]:
y = train["SalePrice"]

X = pd.get_dummies(train.drop(columns=["SalePrice"]))
X_test = pd.get_dummies(test)

X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.25, random_state=SEED)

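model = xg.XGBRegressor(
    n_estimators=500, 
    learning_rate=0.1, 
    max_depth=6,
    early_stopping_rounds=20,
    reg_alpha=1,   # L1 regularization
    reg_lambda=1,  # L2 regularization ("lambda_" is not a recognized parameter name)
    gamma=0.1,
    random_state=SEED)

# Note: the model is fitted on all of X, which includes the rows in X_val.
# The "validation" RMSE is therefore not a held-out estimate, and early
# stopping has little to react to because that error keeps falling.
model.fit(X, y, 
            eval_set=[(X_train, y_train), (X_val, y_val)])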

results = model.evals_result()

plt.figure(figsize=(10,7))
plt.plot(results["validation_0"]["rmse"], label="Training loss")
plt.plot(results["validation_1"]["rmse"], label="Validation loss")
plt.xlabel("Number of trees")
plt.ylabel("Loss")
plt.legend()

# Lowest RMSE recorded on the second eval set
rms = min(results["validation_1"]["rmse"])

predictions = model.predict(X_test)
predictions_val = model.predict(X_val)

print(f"RMSE Score: {rms}")

output = pd.DataFrame({'Id': test['Id'], 'SalePrice': predictions})
output.to_csv('submission_experiment.csv', index=False)
print("Your submission was successfully saved!")
[0]	validation_0-rmse:68692.95549	validation_1-rmse:71468.80278
[1]	validation_0-rmse:62918.01937	validation_1-rmse:65195.72850
[2]	validation_0-rmse:57649.09855	validation_1-rmse:59810.39291
[3]	validation_0-rmse:52956.96228	validation_1-rmse:54739.63582
[4]	validation_0-rmse:48705.10215	validation_1-rmse:50144.94079
[5]	validation_0-rmse:44867.54684	validation_1-rmse:46121.25656
[6]	validation_0-rmse:41466.39687	validation_1-rmse:42394.29842
[7]	validation_0-rmse:38320.51782	validation_1-rmse:39162.25955
[8]	validation_0-rmse:35384.73829	validation_1-rmse:36240.14309
[9]	validation_0-rmse:32804.86725	validation_1-rmse:33493.54485
[10]	validation_0-rmse:30405.63712	validation_1-rmse:31062.94576
[11]	validation_0-rmse:28255.76638	validation_1-rmse:28898.78443
[12]	validation_0-rmse:26337.84239	validation_1-rmse:26989.61587
[13]	validation_0-rmse:24599.30967	validation_1-rmse:25094.60023
[14]	validation_0-rmse:23018.69847	validation_1-rmse:23472.90585
[15]	validation_0-rmse:21610.96075	validation_1-rmse:21970.20603
[16]	validation_0-rmse:20306.01875	validation_1-rmse:20700.06238
[17]	validation_0-rmse:19121.14860	validation_1-rmse:19578.55128
[18]	validation_0-rmse:18085.23365	validation_1-rmse:18455.03214
[19]	validation_0-rmse:17128.34156	validation_1-rmse:17490.29288
[20]	validation_0-rmse:16221.74241	validation_1-rmse:16647.92451
[21]	validation_0-rmse:15490.00403	validation_1-rmse:15853.18632
[22]	validation_0-rmse:14741.48011	validation_1-rmse:15052.13969
[23]	validation_0-rmse:14092.59564	validation_1-rmse:14374.69408
[24]	validation_0-rmse:13477.69231	validation_1-rmse:13759.96050
[25]	validation_0-rmse:12940.11753	validation_1-rmse:13202.79978
[26]	validation_0-rmse:12458.20172	validation_1-rmse:12747.76060
[27]	validation_0-rmse:11947.72949	validation_1-rmse:12324.34023
[28]	validation_0-rmse:11527.52911	validation_1-rmse:11845.37914
[29]	validation_0-rmse:11151.65115	validation_1-rmse:11462.01648
[30]	validation_0-rmse:10799.26118	validation_1-rmse:11088.44583
[31]	validation_0-rmse:10468.57123	validation_1-rmse:10778.66189
[32]	validation_0-rmse:10158.36147	validation_1-rmse:10473.13564
[33]	validation_0-rmse:9884.26067	validation_1-rmse:10240.30621
[34]	validation_0-rmse:9648.65134	validation_1-rmse:9967.58628
[35]	validation_0-rmse:9400.60138	validation_1-rmse:9764.20274
[36]	validation_0-rmse:9200.18080	validation_1-rmse:9522.96153
[37]	validation_0-rmse:9027.59931	validation_1-rmse:9308.54452
[38]	validation_0-rmse:8844.49687	validation_1-rmse:9080.31323
[39]	validation_0-rmse:8689.45924	validation_1-rmse:8916.79135
[40]	validation_0-rmse:8517.24712	validation_1-rmse:8773.22911
[41]	validation_0-rmse:8383.93332	validation_1-rmse:8604.41454
[42]	validation_0-rmse:8262.20377	validation_1-rmse:8471.49163
[43]	validation_0-rmse:8143.78445	validation_1-rmse:8370.99911
[44]	validation_0-rmse:8023.18485	validation_1-rmse:8235.39855
[45]	validation_0-rmse:7910.34506	validation_1-rmse:8188.71055
[46]	validation_0-rmse:7772.96058	validation_1-rmse:8088.99451
[47]	validation_0-rmse:7662.19374	validation_1-rmse:7943.93441
[48]	validation_0-rmse:7546.83082	validation_1-rmse:7843.88672
[49]	validation_0-rmse:7457.34589	validation_1-rmse:7776.78279
[50]	validation_0-rmse:7348.29445	validation_1-rmse:7695.67272
[51]	validation_0-rmse:7251.09694	validation_1-rmse:7631.51805
[52]	validation_0-rmse:7188.93662	validation_1-rmse:7558.35071
[53]	validation_0-rmse:7100.25003	validation_1-rmse:7453.17204
[54]	validation_0-rmse:7014.86600	validation_1-rmse:7385.38071
[55]	validation_0-rmse:6960.12197	validation_1-rmse:7347.49498
[56]	validation_0-rmse:6852.86376	validation_1-rmse:7253.59416
[57]	validation_0-rmse:6765.16585	validation_1-rmse:7196.55055
[58]	validation_0-rmse:6725.84972	validation_1-rmse:7151.26483
[59]	validation_0-rmse:6655.74194	validation_1-rmse:7095.22968
[60]	validation_0-rmse:6610.68962	validation_1-rmse:7064.37130
[61]	validation_0-rmse:6538.06282	validation_1-rmse:6929.95033
[62]	validation_0-rmse:6483.79752	validation_1-rmse:6888.60951
[63]	validation_0-rmse:6419.52527	validation_1-rmse:6795.00007
[64]	validation_0-rmse:6388.56361	validation_1-rmse:6754.16475
[65]	validation_0-rmse:6309.72396	validation_1-rmse:6636.81957
[66]	validation_0-rmse:6287.47980	validation_1-rmse:6594.94820
[67]	validation_0-rmse:6235.47968	validation_1-rmse:6545.26632
[68]	validation_0-rmse:6181.80336	validation_1-rmse:6446.66294
[69]	validation_0-rmse:6133.34182	validation_1-rmse:6398.79432
[70]	validation_0-rmse:6050.54992	validation_1-rmse:6297.11522
[71]	validation_0-rmse:6001.35172	validation_1-rmse:6223.87198
[72]	validation_0-rmse:5972.86783	validation_1-rmse:6176.52007
[73]	validation_0-rmse:5939.09047	validation_1-rmse:6145.73494
[74]	validation_0-rmse:5909.37837	validation_1-rmse:6121.90226
[75]	validation_0-rmse:5871.33367	validation_1-rmse:6071.69788
[76]	validation_0-rmse:5819.04676	validation_1-rmse:6008.06799
[77]	validation_0-rmse:5786.09140	validation_1-rmse:5982.34667
[78]	validation_0-rmse:5761.63401	validation_1-rmse:5957.34271
[79]	validation_0-rmse:5726.55094	validation_1-rmse:5933.80194
[80]	validation_0-rmse:5707.30860	validation_1-rmse:5920.25232
[81]	validation_0-rmse:5673.08525	validation_1-rmse:5861.75651
[82]	validation_0-rmse:5644.59030	validation_1-rmse:5837.39821
[83]	validation_0-rmse:5571.53892	validation_1-rmse:5785.39989
[84]	validation_0-rmse:5557.65609	validation_1-rmse:5774.73530
[85]	validation_0-rmse:5518.95385	validation_1-rmse:5756.77192
[86]	validation_0-rmse:5499.40318	validation_1-rmse:5735.14091
[87]	validation_0-rmse:5411.58727	validation_1-rmse:5649.58908
[88]	validation_0-rmse:5388.63187	validation_1-rmse:5619.67225
[89]	validation_0-rmse:5340.95007	validation_1-rmse:5583.82299
[90]	validation_0-rmse:5314.32655	validation_1-rmse:5540.40790
[91]	validation_0-rmse:5284.67093	validation_1-rmse:5513.91732
[92]	validation_0-rmse:5266.65486	validation_1-rmse:5500.34474
[93]	validation_0-rmse:5219.64997	validation_1-rmse:5456.66558
[94]	validation_0-rmse:5192.53637	validation_1-rmse:5408.86700
[95]	validation_0-rmse:5133.05640	validation_1-rmse:5344.91401
[96]	validation_0-rmse:5091.72849	validation_1-rmse:5318.07966
[97]	validation_0-rmse:5083.26861	validation_1-rmse:5309.66056
[98]	validation_0-rmse:5034.60158	validation_1-rmse:5269.36045
[99]	validation_0-rmse:5004.63890	validation_1-rmse:5251.44962
[100]	validation_0-rmse:4962.09687	validation_1-rmse:5209.57026
[101]	validation_0-rmse:4880.85993	validation_1-rmse:5124.50647
[102]	validation_0-rmse:4867.50537	validation_1-rmse:5107.20070
[103]	validation_0-rmse:4825.22769	validation_1-rmse:5048.80260
[104]	validation_0-rmse:4798.14509	validation_1-rmse:5015.46801
[105]	validation_0-rmse:4780.11509	validation_1-rmse:4995.57538
[106]	validation_0-rmse:4743.32533	validation_1-rmse:4976.83555
[107]	validation_0-rmse:4707.96210	validation_1-rmse:4932.79517
[108]	validation_0-rmse:4684.82369	validation_1-rmse:4906.61174
[109]	validation_0-rmse:4669.32489	validation_1-rmse:4894.80115
[110]	validation_0-rmse:4648.47558	validation_1-rmse:4867.41833
[111]	validation_0-rmse:4632.27058	validation_1-rmse:4856.64312
[112]	validation_0-rmse:4593.93901	validation_1-rmse:4806.84769
[113]	validation_0-rmse:4573.09374	validation_1-rmse:4764.84422
[114]	validation_0-rmse:4527.65565	validation_1-rmse:4722.94548
[115]	validation_0-rmse:4511.22352	validation_1-rmse:4696.47360
[116]	validation_0-rmse:4504.65021	validation_1-rmse:4692.26253
[117]	validation_0-rmse:4482.58813	validation_1-rmse:4660.15526
[118]	validation_0-rmse:4440.44188	validation_1-rmse:4625.06221
[119]	validation_0-rmse:4401.07534	validation_1-rmse:4555.22122
[120]	validation_0-rmse:4375.91212	validation_1-rmse:4532.83982
[121]	validation_0-rmse:4362.02334	validation_1-rmse:4520.30888
[122]	validation_0-rmse:4310.68478	validation_1-rmse:4494.70211
[123]	validation_0-rmse:4304.16382	validation_1-rmse:4488.36688
[124]	validation_0-rmse:4286.57862	validation_1-rmse:4473.80531
[125]	validation_0-rmse:4256.10948	validation_1-rmse:4452.32571
[126]	validation_0-rmse:4206.46423	validation_1-rmse:4391.63304
[127]	validation_0-rmse:4198.29688	validation_1-rmse:4381.72428
[128]	validation_0-rmse:4189.90499	validation_1-rmse:4372.74871
[129]	validation_0-rmse:4162.58917	validation_1-rmse:4352.94137
[130]	validation_0-rmse:4093.66178	validation_1-rmse:4262.74574
[131]	validation_0-rmse:4059.62437	validation_1-rmse:4227.72002
[132]	validation_0-rmse:4030.86804	validation_1-rmse:4203.47957
[133]	validation_0-rmse:3989.46765	validation_1-rmse:4142.86687
[134]	validation_0-rmse:3971.59394	validation_1-rmse:4118.98935
[135]	validation_0-rmse:3941.39358	validation_1-rmse:4087.83450
[136]	validation_0-rmse:3916.92955	validation_1-rmse:4069.37775
[137]	validation_0-rmse:3883.06364	validation_1-rmse:4036.91030
[138]	validation_0-rmse:3857.28052	validation_1-rmse:4012.96331
[139]	validation_0-rmse:3832.65470	validation_1-rmse:3992.91194
[140]	validation_0-rmse:3811.67945	validation_1-rmse:3959.96831
[141]	validation_0-rmse:3792.40626	validation_1-rmse:3943.16434
[142]	validation_0-rmse:3754.45763	validation_1-rmse:3911.89734
[143]	validation_0-rmse:3715.75888	validation_1-rmse:3870.17471
[144]	validation_0-rmse:3684.16176	validation_1-rmse:3833.29216
[145]	validation_0-rmse:3677.25588	validation_1-rmse:3821.40331
[146]	validation_0-rmse:3651.65554	validation_1-rmse:3790.28079
[147]	validation_0-rmse:3633.49181	validation_1-rmse:3777.24697
[148]	validation_0-rmse:3607.61596	validation_1-rmse:3755.02797
[149]	validation_0-rmse:3556.16950	validation_1-rmse:3714.47320
[150]	validation_0-rmse:3505.58328	validation_1-rmse:3682.20823
[151]	validation_0-rmse:3469.75837	validation_1-rmse:3646.60189
[152]	validation_0-rmse:3450.53201	validation_1-rmse:3626.07031
[153]	validation_0-rmse:3436.87437	validation_1-rmse:3606.92679
[154]	validation_0-rmse:3383.82553	validation_1-rmse:3546.47829
[155]	validation_0-rmse:3352.20780	validation_1-rmse:3523.10795
[156]	validation_0-rmse:3332.03618	validation_1-rmse:3497.98649
[157]	validation_0-rmse:3308.03747	validation_1-rmse:3465.32892
[158]	validation_0-rmse:3297.71883	validation_1-rmse:3452.79150
[159]	validation_0-rmse:3270.69566	validation_1-rmse:3429.91629
[160]	validation_0-rmse:3252.12357	validation_1-rmse:3397.87511
[161]	validation_0-rmse:3244.75330	validation_1-rmse:3389.04678
[162]	validation_0-rmse:3214.36765	validation_1-rmse:3361.39259
[163]	validation_0-rmse:3201.89313	validation_1-rmse:3347.11517
[164]	validation_0-rmse:3191.04734	validation_1-rmse:3339.01151
[165]	validation_0-rmse:3178.31971	validation_1-rmse:3329.17389
[166]	validation_0-rmse:3153.79842	validation_1-rmse:3301.91703
[167]	validation_0-rmse:3118.57234	validation_1-rmse:3267.57567
[168]	validation_0-rmse:3112.37739	validation_1-rmse:3260.07710
[169]	validation_0-rmse:3103.58519	validation_1-rmse:3246.50681
[170]	validation_0-rmse:3070.43609	validation_1-rmse:3212.66540
[171]	validation_0-rmse:3045.81707	validation_1-rmse:3192.15865
[172]	validation_0-rmse:3036.13939	validation_1-rmse:3178.55781
[173]	validation_0-rmse:3029.64341	validation_1-rmse:3173.35585
[174]	validation_0-rmse:3007.84150	validation_1-rmse:3142.53606
[175]	validation_0-rmse:2983.89869	validation_1-rmse:3127.31516
[176]	validation_0-rmse:2962.92805	validation_1-rmse:3114.81735
[177]	validation_0-rmse:2957.30454	validation_1-rmse:3102.99748
[178]	validation_0-rmse:2948.08769	validation_1-rmse:3084.26202
[179]	validation_0-rmse:2908.41073	validation_1-rmse:3043.08806
[180]	validation_0-rmse:2872.58633	validation_1-rmse:3006.98814
[181]	validation_0-rmse:2843.92082	validation_1-rmse:2969.65899
[182]	validation_0-rmse:2815.81727	validation_1-rmse:2948.44293
[183]	validation_0-rmse:2795.86647	validation_1-rmse:2925.16811
[184]	validation_0-rmse:2775.38880	validation_1-rmse:2901.38680
[185]	validation_0-rmse:2770.14958	validation_1-rmse:2895.65820
[186]	validation_0-rmse:2756.46737	validation_1-rmse:2880.46523
[187]	validation_0-rmse:2731.59887	validation_1-rmse:2863.59571
[188]	validation_0-rmse:2726.25454	validation_1-rmse:2858.93024
[189]	validation_0-rmse:2722.89032	validation_1-rmse:2854.07023
[190]	validation_0-rmse:2703.21909	validation_1-rmse:2843.41516
[191]	validation_0-rmse:2666.96916	validation_1-rmse:2813.48565
[192]	validation_0-rmse:2657.39671	validation_1-rmse:2807.17380
[193]	validation_0-rmse:2631.07609	validation_1-rmse:2795.73868
[194]	validation_0-rmse:2627.08515	validation_1-rmse:2789.82928
[195]	validation_0-rmse:2596.98676	validation_1-rmse:2760.34264
[196]	validation_0-rmse:2581.17209	validation_1-rmse:2744.67097
[197]	validation_0-rmse:2563.99855	validation_1-rmse:2722.29052
[198]	validation_0-rmse:2551.79371	validation_1-rmse:2712.75076
[199]	validation_0-rmse:2544.56748	validation_1-rmse:2701.96722
[200]	validation_0-rmse:2539.48581	validation_1-rmse:2694.43937
[201]	validation_0-rmse:2527.37617	validation_1-rmse:2684.46242
[202]	validation_0-rmse:2516.34169	validation_1-rmse:2671.59032
[203]	validation_0-rmse:2497.38062	validation_1-rmse:2647.93834
[204]	validation_0-rmse:2470.18563	validation_1-rmse:2617.45114
[205]	validation_0-rmse:2466.12118	validation_1-rmse:2613.83596
[206]	validation_0-rmse:2458.55428	validation_1-rmse:2604.49683
[207]	validation_0-rmse:2429.39420	validation_1-rmse:2575.20808
[208]	validation_0-rmse:2410.46093	validation_1-rmse:2555.95691
[209]	validation_0-rmse:2391.30818	validation_1-rmse:2545.06835
[210]	validation_0-rmse:2380.23649	validation_1-rmse:2529.43071
[211]	validation_0-rmse:2378.64755	validation_1-rmse:2527.01606
[212]	validation_0-rmse:2366.71687	validation_1-rmse:2509.57337
[213]	validation_0-rmse:2357.16115	validation_1-rmse:2492.69042
[214]	validation_0-rmse:2343.17509	validation_1-rmse:2478.40587
[215]	validation_0-rmse:2327.72756	validation_1-rmse:2467.82424
[216]	validation_0-rmse:2313.78401	validation_1-rmse:2455.58457
[217]	validation_0-rmse:2307.78238	validation_1-rmse:2447.42696
[218]	validation_0-rmse:2298.94524	validation_1-rmse:2435.71009
[219]	validation_0-rmse:2294.67854	validation_1-rmse:2432.58052
[220]	validation_0-rmse:2284.55195	validation_1-rmse:2423.01225
[221]	validation_0-rmse:2280.90476	validation_1-rmse:2409.70469
[222]	validation_0-rmse:2267.28646	validation_1-rmse:2399.59270
[223]	validation_0-rmse:2260.18053	validation_1-rmse:2394.93130
[224]	validation_0-rmse:2236.98313	validation_1-rmse:2374.53571
[225]	validation_0-rmse:2203.40672	validation_1-rmse:2330.69678
[226]	validation_0-rmse:2190.68278	validation_1-rmse:2318.14091
[227]	validation_0-rmse:2160.62852	validation_1-rmse:2293.26404
[228]	validation_0-rmse:2157.24960	validation_1-rmse:2288.20779
[229]	validation_0-rmse:2145.96960	validation_1-rmse:2278.19584
[230]	validation_0-rmse:2117.18248	validation_1-rmse:2249.48009
[231]	validation_0-rmse:2106.91742	validation_1-rmse:2236.85756
[232]	validation_0-rmse:2102.16103	validation_1-rmse:2232.20228
[233]	validation_0-rmse:2085.90810	validation_1-rmse:2208.14650
[234]	validation_0-rmse:2064.88088	validation_1-rmse:2191.42052
[235]	validation_0-rmse:2051.18969	validation_1-rmse:2182.11604
[236]	validation_0-rmse:2046.99297	validation_1-rmse:2179.17784
[237]	validation_0-rmse:2033.82171	validation_1-rmse:2161.77707
[238]	validation_0-rmse:2024.52803	validation_1-rmse:2154.14445
[239]	validation_0-rmse:2021.68672	validation_1-rmse:2149.51149
[240]	validation_0-rmse:2009.00434	validation_1-rmse:2134.48556
[241]	validation_0-rmse:1994.01008	validation_1-rmse:2109.62881
[242]	validation_0-rmse:1962.96241	validation_1-rmse:2077.63265
[243]	validation_0-rmse:1947.43050	validation_1-rmse:2065.51839
[244]	validation_0-rmse:1922.57974	validation_1-rmse:2039.56017
[245]	validation_0-rmse:1921.46035	validation_1-rmse:2037.56382
[246]	validation_0-rmse:1904.35610	validation_1-rmse:2024.06746
[247]	validation_0-rmse:1887.26063	validation_1-rmse:2012.96955
[248]	validation_0-rmse:1866.06726	validation_1-rmse:1992.13959
[249]	validation_0-rmse:1860.43052	validation_1-rmse:1984.10128
[250]	validation_0-rmse:1849.97725	validation_1-rmse:1971.82636
[251]	validation_0-rmse:1841.07305	validation_1-rmse:1957.02083
[252]	validation_0-rmse:1821.48272	validation_1-rmse:1946.89375
[253]	validation_0-rmse:1811.06366	validation_1-rmse:1934.38007
[254]	validation_0-rmse:1791.98731	validation_1-rmse:1905.94924
[255]	validation_0-rmse:1768.64652	validation_1-rmse:1884.92830
[256]	validation_0-rmse:1764.68992	validation_1-rmse:1881.68876
[257]	validation_0-rmse:1751.09675	validation_1-rmse:1874.41176
[258]	validation_0-rmse:1745.63563	validation_1-rmse:1862.64000
[259]	validation_0-rmse:1742.98073	validation_1-rmse:1858.34865
[260]	validation_0-rmse:1735.07924	validation_1-rmse:1854.04390
[261]	validation_0-rmse:1721.83325	validation_1-rmse:1847.14197
[262]	validation_0-rmse:1704.69498	validation_1-rmse:1836.56036
[263]	validation_0-rmse:1692.55966	validation_1-rmse:1824.09266
[264]	validation_0-rmse:1681.00079	validation_1-rmse:1813.07919
[265]	validation_0-rmse:1673.55657	validation_1-rmse:1807.43314
[266]	validation_0-rmse:1668.76928	validation_1-rmse:1801.54249
[267]	validation_0-rmse:1658.67381	validation_1-rmse:1790.41968
[268]	validation_0-rmse:1643.68628	validation_1-rmse:1778.90677
[269]	validation_0-rmse:1641.99132	validation_1-rmse:1777.64264
[270]	validation_0-rmse:1630.98130	validation_1-rmse:1764.82075
[271]	validation_0-rmse:1624.59075	validation_1-rmse:1759.36291
[272]	validation_0-rmse:1619.69092	validation_1-rmse:1755.86429
[273]	validation_0-rmse:1617.42869	validation_1-rmse:1753.74214
[274]	validation_0-rmse:1615.15340	validation_1-rmse:1753.14444
[275]	validation_0-rmse:1605.16680	validation_1-rmse:1732.55274
[276]	validation_0-rmse:1591.67565	validation_1-rmse:1717.09022
[277]	validation_0-rmse:1580.68448	validation_1-rmse:1704.47816
[278]	validation_0-rmse:1578.48351	validation_1-rmse:1702.35605
[279]	validation_0-rmse:1567.05964	validation_1-rmse:1687.27992
[280]	validation_0-rmse:1556.74135	validation_1-rmse:1675.03154
[281]	validation_0-rmse:1540.59331	validation_1-rmse:1660.48204
[282]	validation_0-rmse:1533.36069	validation_1-rmse:1652.56217
[283]	validation_0-rmse:1515.09974	validation_1-rmse:1621.86201
[284]	validation_0-rmse:1509.72645	validation_1-rmse:1618.53310
[285]	validation_0-rmse:1498.58833	validation_1-rmse:1605.43154
[286]	validation_0-rmse:1483.39250	validation_1-rmse:1593.94866
[287]	validation_0-rmse:1473.14269	validation_1-rmse:1581.86674
[288]	validation_0-rmse:1448.77379	validation_1-rmse:1555.92206
[289]	validation_0-rmse:1434.13728	validation_1-rmse:1542.37626
[290]	validation_0-rmse:1420.79079	validation_1-rmse:1533.49019
[291]	validation_0-rmse:1416.97399	validation_1-rmse:1528.13185
[292]	validation_0-rmse:1408.02613	validation_1-rmse:1509.97963
[293]	validation_0-rmse:1395.23615	validation_1-rmse:1500.64092
[294]	validation_0-rmse:1388.85338	validation_1-rmse:1491.22366
[295]	validation_0-rmse:1376.29553	validation_1-rmse:1480.47880
[296]	validation_0-rmse:1361.11959	validation_1-rmse:1467.10018
[297]	validation_0-rmse:1351.77025	validation_1-rmse:1455.86577
[298]	validation_0-rmse:1336.17305	validation_1-rmse:1437.15565
[299]	validation_0-rmse:1330.44113	validation_1-rmse:1431.98815
[300]	validation_0-rmse:1323.03965	validation_1-rmse:1424.94016
[301]	validation_0-rmse:1315.51016	validation_1-rmse:1416.50926
[302]	validation_0-rmse:1306.90973	validation_1-rmse:1408.83388
[303]	validation_0-rmse:1297.79382	validation_1-rmse:1399.45831
[304]	validation_0-rmse:1287.55354	validation_1-rmse:1386.09859
[305]	validation_0-rmse:1281.68989	validation_1-rmse:1380.60793
[306]	validation_0-rmse:1267.36769	validation_1-rmse:1368.53405
[307]	validation_0-rmse:1259.46624	validation_1-rmse:1362.96065
[308]	validation_0-rmse:1255.70230	validation_1-rmse:1359.02235
[309]	validation_0-rmse:1249.99869	validation_1-rmse:1352.56653
[310]	validation_0-rmse:1240.27533	validation_1-rmse:1343.69005
[311]	validation_0-rmse:1234.82521	validation_1-rmse:1342.25331
[312]	validation_0-rmse:1226.01631	validation_1-rmse:1328.68791
[313]	validation_0-rmse:1216.57544	validation_1-rmse:1320.74652
[314]	validation_0-rmse:1207.44894	validation_1-rmse:1311.47812
[315]	validation_0-rmse:1203.07262	validation_1-rmse:1307.08728
[316]	validation_0-rmse:1191.61848	validation_1-rmse:1296.53328
[317]	validation_0-rmse:1187.06234	validation_1-rmse:1291.59549
[318]	validation_0-rmse:1184.69826	validation_1-rmse:1288.74536
[319]	validation_0-rmse:1180.82500	validation_1-rmse:1284.49831
[320]	validation_0-rmse:1168.09824	validation_1-rmse:1267.18429
[321]	validation_0-rmse:1163.17288	validation_1-rmse:1261.86886
[322]	validation_0-rmse:1160.47291	validation_1-rmse:1259.53987
[323]	validation_0-rmse:1157.41320	validation_1-rmse:1255.98255
[324]	validation_0-rmse:1133.68081	validation_1-rmse:1233.88486
[325]	validation_0-rmse:1124.38573	validation_1-rmse:1224.16454
[326]	validation_0-rmse:1117.47206	validation_1-rmse:1213.41030
[327]	validation_0-rmse:1110.09206	validation_1-rmse:1204.39935
[328]	validation_0-rmse:1102.90672	validation_1-rmse:1196.21123
[329]	validation_0-rmse:1091.75672	validation_1-rmse:1186.33095
[330]	validation_0-rmse:1084.62568	validation_1-rmse:1181.03279
[331]	validation_0-rmse:1082.99572	validation_1-rmse:1179.03392
[332]	validation_0-rmse:1071.58113	validation_1-rmse:1165.10769
[333]	validation_0-rmse:1067.70668	validation_1-rmse:1158.62258
[334]	validation_0-rmse:1062.01345	validation_1-rmse:1148.49476
[335]	validation_0-rmse:1057.27704	validation_1-rmse:1140.28387
[336]	validation_0-rmse:1047.57043	validation_1-rmse:1127.44191
[337]	validation_0-rmse:1038.03755	validation_1-rmse:1114.48357
[338]	validation_0-rmse:1035.80113	validation_1-rmse:1113.13924
[339]	validation_0-rmse:1030.70894	validation_1-rmse:1108.99230
[340]	validation_0-rmse:1022.20406	validation_1-rmse:1096.74973
[341]	validation_0-rmse:1018.35974	validation_1-rmse:1092.51175
[342]	validation_0-rmse:1015.01250	validation_1-rmse:1087.15910
[343]	validation_0-rmse:1010.02578	validation_1-rmse:1082.52730
[344]	validation_0-rmse:1006.72764	validation_1-rmse:1079.56225
[345]	validation_0-rmse:1000.66664	validation_1-rmse:1072.75425
[346]	validation_0-rmse:992.74489	validation_1-rmse:1066.47673
[347]	validation_0-rmse:982.13268	validation_1-rmse:1054.10551
[348]	validation_0-rmse:977.26441	validation_1-rmse:1050.16953
[349]	validation_0-rmse:972.57276	validation_1-rmse:1042.87541
[350]	validation_0-rmse:967.36103	validation_1-rmse:1038.70665
[351]	validation_0-rmse:957.43327	validation_1-rmse:1026.40027
[352]	validation_0-rmse:955.11067	validation_1-rmse:1021.85494
[353]	validation_0-rmse:950.62209	validation_1-rmse:1013.36173
[354]	validation_0-rmse:941.97329	validation_1-rmse:1000.90180
[355]	validation_0-rmse:936.12116	validation_1-rmse:992.94661
[356]	validation_0-rmse:927.65517	validation_1-rmse:983.76466
[357]	validation_0-rmse:924.08591	validation_1-rmse:979.54978
[358]	validation_0-rmse:912.88565	validation_1-rmse:964.89354
[359]	validation_0-rmse:903.46344	validation_1-rmse:951.00396
[360]	validation_0-rmse:893.75225	validation_1-rmse:940.71555
[361]	validation_0-rmse:889.08104	validation_1-rmse:936.89416
[362]	validation_0-rmse:884.30825	validation_1-rmse:931.27754
[363]	validation_0-rmse:882.05518	validation_1-rmse:929.84983
[364]	validation_0-rmse:875.42699	validation_1-rmse:925.35389
[365]	validation_0-rmse:869.69639	validation_1-rmse:918.19321
[366]	validation_0-rmse:862.71045	validation_1-rmse:907.50560
[367]	validation_0-rmse:856.67119	validation_1-rmse:901.19355
[368]	validation_0-rmse:845.07229	validation_1-rmse:887.55682
[369]	validation_0-rmse:837.95182	validation_1-rmse:879.44621
[370]	validation_0-rmse:830.01503	validation_1-rmse:871.37923
[371]	validation_0-rmse:823.60766	validation_1-rmse:864.52640
[372]	validation_0-rmse:816.05077	validation_1-rmse:855.78544
[373]	validation_0-rmse:810.07692	validation_1-rmse:848.16686
[374]	validation_0-rmse:808.93300	validation_1-rmse:846.75750
[375]	validation_0-rmse:800.17793	validation_1-rmse:835.96411
[376]	validation_0-rmse:787.93600	validation_1-rmse:827.75105
[377]	validation_0-rmse:777.00199	validation_1-rmse:817.75945
[378]	validation_0-rmse:770.71053	validation_1-rmse:809.86075
[379]	validation_0-rmse:766.72312	validation_1-rmse:805.72022
[380]	validation_0-rmse:755.65324	validation_1-rmse:795.34884
[381]	validation_0-rmse:749.50056	validation_1-rmse:788.56745
[382]	validation_0-rmse:746.50473	validation_1-rmse:787.75603
[383]	validation_0-rmse:742.82023	validation_1-rmse:782.60412
[384]	validation_0-rmse:738.94763	validation_1-rmse:776.24942
[385]	validation_0-rmse:729.17302	validation_1-rmse:766.97505
[386]	validation_0-rmse:722.02748	validation_1-rmse:760.42082
[387]	validation_0-rmse:716.54785	validation_1-rmse:756.27456
[388]	validation_0-rmse:708.01784	validation_1-rmse:743.35793
[389]	validation_0-rmse:699.95427	validation_1-rmse:732.85032
[390]	validation_0-rmse:691.90597	validation_1-rmse:726.66872
[391]	validation_0-rmse:681.63380	validation_1-rmse:715.42713
[392]	validation_0-rmse:676.59932	validation_1-rmse:710.77263
[393]	validation_0-rmse:671.99680	validation_1-rmse:704.96396
[394]	validation_0-rmse:666.96151	validation_1-rmse:700.52898
[395]	validation_0-rmse:662.54533	validation_1-rmse:695.08182
[396]	validation_0-rmse:659.13916	validation_1-rmse:693.22702
[397]	validation_0-rmse:654.47900	validation_1-rmse:688.70765
[398]	validation_0-rmse:653.34493	validation_1-rmse:686.16588
[399]	validation_0-rmse:650.42020	validation_1-rmse:682.55256
[400]	validation_0-rmse:646.33932	validation_1-rmse:678.01783
[401]	validation_0-rmse:640.42195	validation_1-rmse:672.65315
[402]	validation_0-rmse:635.11308	validation_1-rmse:666.22477
[403]	validation_0-rmse:632.10842	validation_1-rmse:661.91357
[404]	validation_0-rmse:629.50165	validation_1-rmse:656.38737
[405]	validation_0-rmse:627.45671	validation_1-rmse:652.96017
[406]	validation_0-rmse:624.03807	validation_1-rmse:649.73062
[407]	validation_0-rmse:617.42381	validation_1-rmse:644.83905
[408]	validation_0-rmse:614.24998	validation_1-rmse:639.85870
[409]	validation_0-rmse:611.93487	validation_1-rmse:636.68713
[410]	validation_0-rmse:607.65657	validation_1-rmse:632.56294
[411]	validation_0-rmse:606.62828	validation_1-rmse:631.71146
[412]	validation_0-rmse:603.39935	validation_1-rmse:629.22777
[413]	validation_0-rmse:599.40693	validation_1-rmse:626.65971
[414]	validation_0-rmse:596.39313	validation_1-rmse:624.58614
[415]	validation_0-rmse:594.29236	validation_1-rmse:622.89066
[416]	validation_0-rmse:586.78096	validation_1-rmse:615.23934
[417]	validation_0-rmse:583.50411	validation_1-rmse:612.68285
[418]	validation_0-rmse:581.77838	validation_1-rmse:611.38113
[419]	validation_0-rmse:576.97472	validation_1-rmse:604.96383
[420]	validation_0-rmse:573.87582	validation_1-rmse:600.74450
[421]	validation_0-rmse:572.53262	validation_1-rmse:598.14705
[422]	validation_0-rmse:570.83916	validation_1-rmse:594.98777
[423]	validation_0-rmse:567.79471	validation_1-rmse:591.01712
[424]	validation_0-rmse:565.66300	validation_1-rmse:590.39107
[425]	validation_0-rmse:562.96944	validation_1-rmse:587.08557
[426]	validation_0-rmse:557.60191	validation_1-rmse:582.31899
[427]	validation_0-rmse:552.97352	validation_1-rmse:578.72870
[428]	validation_0-rmse:550.49260	validation_1-rmse:576.76071
[429]	validation_0-rmse:549.09160	validation_1-rmse:576.51502
[430]	validation_0-rmse:545.82663	validation_1-rmse:572.66745
[431]	validation_0-rmse:540.90323	validation_1-rmse:565.74622
[432]	validation_0-rmse:538.59316	validation_1-rmse:563.20662
[433]	validation_0-rmse:532.11475	validation_1-rmse:558.16768
[434]	validation_0-rmse:528.36994	validation_1-rmse:552.56197
[435]	validation_0-rmse:520.83351	validation_1-rmse:546.92176
[436]	validation_0-rmse:516.18431	validation_1-rmse:543.95998
[437]	validation_0-rmse:511.98488	validation_1-rmse:538.00930
[438]	validation_0-rmse:508.44241	validation_1-rmse:534.34557
[439]	validation_0-rmse:502.83682	validation_1-rmse:529.92686
[440]	validation_0-rmse:493.38655	validation_1-rmse:522.77181
[441]	validation_0-rmse:490.39778	validation_1-rmse:520.31740
[442]	validation_0-rmse:488.68930	validation_1-rmse:519.45998
[443]	validation_0-rmse:482.78711	validation_1-rmse:512.53349
[444]	validation_0-rmse:477.81279	validation_1-rmse:507.84852
[445]	validation_0-rmse:476.79698	validation_1-rmse:506.80217
[446]	validation_0-rmse:473.44217	validation_1-rmse:501.95119
[447]	validation_0-rmse:469.94351	validation_1-rmse:498.80037
[448]	validation_0-rmse:466.52580	validation_1-rmse:494.07820
[449]	validation_0-rmse:463.85273	validation_1-rmse:492.25921
[450]	validation_0-rmse:461.77793	validation_1-rmse:488.52140
[451]	validation_0-rmse:460.09441	validation_1-rmse:486.82196
[452]	validation_0-rmse:456.43502	validation_1-rmse:482.08783
[453]	validation_0-rmse:454.93399	validation_1-rmse:479.07482
[454]	validation_0-rmse:449.78025	validation_1-rmse:475.97138
[455]	validation_0-rmse:448.81008	validation_1-rmse:475.91944
[456]	validation_0-rmse:447.31621	validation_1-rmse:473.79997
[457]	validation_0-rmse:445.62529	validation_1-rmse:470.65076
[458]	validation_0-rmse:444.56876	validation_1-rmse:469.32120
[459]	validation_0-rmse:443.37948	validation_1-rmse:468.04406
[460]	validation_0-rmse:437.79994	validation_1-rmse:461.01157
[461]	validation_0-rmse:435.00399	validation_1-rmse:458.15872
[462]	validation_0-rmse:434.27398	validation_1-rmse:457.51142
[463]	validation_0-rmse:433.06713	validation_1-rmse:456.27676
[464]	validation_0-rmse:430.01744	validation_1-rmse:454.07662
[465]	validation_0-rmse:428.17370	validation_1-rmse:451.83303
[466]	validation_0-rmse:423.40895	validation_1-rmse:447.07227
[467]	validation_0-rmse:419.93371	validation_1-rmse:444.64360
[468]	validation_0-rmse:418.94333	validation_1-rmse:443.18982
[469]	validation_0-rmse:418.26987	validation_1-rmse:442.65504
[470]	validation_0-rmse:416.23983	validation_1-rmse:440.53838
[471]	validation_0-rmse:414.52364	validation_1-rmse:439.21531
[472]	validation_0-rmse:409.94415	validation_1-rmse:435.22991
[473]	validation_0-rmse:406.25062	validation_1-rmse:431.36180
[474]	validation_0-rmse:404.27954	validation_1-rmse:429.31406
[475]	validation_0-rmse:400.88409	validation_1-rmse:426.24500
[476]	validation_0-rmse:400.08360	validation_1-rmse:425.26621
[477]	validation_0-rmse:397.39967	validation_1-rmse:422.59442
[478]	validation_0-rmse:395.69525	validation_1-rmse:420.31330
[479]	validation_0-rmse:392.79927	validation_1-rmse:416.41740
[480]	validation_0-rmse:388.90586	validation_1-rmse:413.38713
[481]	validation_0-rmse:382.25669	validation_1-rmse:407.56848
[482]	validation_0-rmse:378.53509	validation_1-rmse:403.66055
[483]	validation_0-rmse:377.90080	validation_1-rmse:403.13032
[484]	validation_0-rmse:376.27100	validation_1-rmse:399.72091
[485]	validation_0-rmse:375.01688	validation_1-rmse:397.87453
[486]	validation_0-rmse:373.11877	validation_1-rmse:395.31981
[487]	validation_0-rmse:372.58348	validation_1-rmse:395.05127
[488]	validation_0-rmse:371.62051	validation_1-rmse:394.09473
[489]	validation_0-rmse:368.24673	validation_1-rmse:391.53530
[490]	validation_0-rmse:366.81173	validation_1-rmse:390.90348
[491]	validation_0-rmse:365.16926	validation_1-rmse:388.46630
[492]	validation_0-rmse:362.47711	validation_1-rmse:383.84195
[493]	validation_0-rmse:361.54045	validation_1-rmse:382.70691
[494]	validation_0-rmse:358.30914	validation_1-rmse:379.42996
[495]	validation_0-rmse:355.68943	validation_1-rmse:376.98049
[496]	validation_0-rmse:354.61412	validation_1-rmse:376.25955
[497]	validation_0-rmse:353.97453	validation_1-rmse:375.48551
[498]	validation_0-rmse:349.76659	validation_1-rmse:371.93303
[499]	validation_0-rmse:348.06015	validation_1-rmse:367.30796
RMSE Score: 367.3079611134532
Your submission was successfully saved!
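The RMSE printed above is measured in dollars on the validation split. As far as I understand the competition's evaluation page, the leaderboard instead scores RMSE between the logarithm of the predicted price and the logarithm of the observed price, so the two numbers are not directly comparable. The sketch below contrasts the two metrics; it uses only the standard library, and the prices in `actual` and `predicted` are hypothetical values chosen for illustration, not results from this notebook.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the target."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def rmse_log(y_true, y_pred):
    """RMSE between log prices; requires strictly positive values."""
    return rmse([math.log(t) for t in y_true], [math.log(p) for p in y_pred])


# Hypothetical sale prices, for illustration only.
actual = [100_000.0, 200_000.0, 400_000.0]
predicted = [110_000.0, 200_000.0, 360_000.0]

dollar_error = rmse(actual, predicted)    # dominated by the expensive house
log_error = rmse_log(actual, predicted)   # treats a 10% miss alike at any price

# Rescaling every price leaves the log-scale error unchanged,
# while the dollar-scale error grows with the prices.
scaled_actual = [10 * a for a in actual]
scaled_predicted = [10 * p for p in predicted]
scaled_log_error = rmse_log(scaled_actual, scaled_predicted)

print(f"RMSE (dollars): {dollar_error:.2f}")
print(f"RMSE (log):     {log_error:.4f}")
```

Because the log-scale metric penalizes relative rather than absolute errors, a model tuned only against dollar RMSE may rank differently on the leaderboard than the validation score suggests.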
In [ ]:
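# Load the JavaScript that interactive SHAP plots (such as force_plot) need
shap.initjs()

# TreeExplainer computes SHAP values for tree ensembles such as XGBoost
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_val)

# Global view: impact of each feature across the validation set
shap.summary_plot(shap_values, X_val)

# Local view: feature contributions to the first validation prediction
shap.force_plot(explainer.expected_value, shap_values[0], X_val.iloc[0])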
[Figure: SHAP summary plot for the validation set (X_val)]
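To make the two SHAP views concrete, here is a minimal sketch of the arithmetic behind them. The summary plot orders features by how large their SHAP values are across rows, and the force plot relies on the additivity property: the base value plus one row's SHAP values equals the model output for that row. The column names are real columns of this dataset, but `shap_rows` and `base_value` are hypothetical numbers standing in for `shap_values` and `explainer.expected_value`.

```python
# Hypothetical SHAP values: one row per house, one column per feature.
feature_names = ["OverallQual", "GrLivArea", "LotArea"]
shap_rows = [
    [12000.0, -3000.0, 500.0],
    [-8000.0, 6000.0, -1500.0],
    [20000.0, 1000.0, 250.0],
]
base_value = 180000.0  # stand-in for explainer.expected_value

# Global importance: mean absolute SHAP value per feature.
n_rows = len(shap_rows)
mean_abs = [
    sum(abs(row[j]) for row in shap_rows) / n_rows
    for j in range(len(feature_names))
]

# Sort features from most to least influential.
ranking = sorted(zip(feature_names, mean_abs), key=lambda pair: pair[1], reverse=True)

# Local explanation: base value plus the first row's contributions
# reconstructs the model output for that row (additivity).
reconstructed = base_value + sum(shap_rows[0])

for name, score in ranking:
    print(f"{name:12s} {score:10.2f}")
print(f"Reconstructed prediction for row 0: {reconstructed:.2f}")
```

With the real arrays from the cell above, the same ranking can be obtained with `np.abs(shap_values).mean(axis=0)`.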